MINOR: Generate expected benchmark query results #4010

andygrove · 2022-10-29T01:05:40Z

Which issue does this PR close?

N/A

Rationale for this change

We really need to start checking that the benchmark queries produce the correct results.

What changes are included in this PR?

Update tpch-gen.sh to:
- Use an available docker image instead of maintaining our own
- Write the expected query results (at SF=1) into ./data/answers
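
As a rough illustration of how the generated answers could be consumed, the sketch below parses answer text into a header and rows. It assumes the answer files are pipe-delimited with a header line and padded cells, which is how the TPC-H reference answers are commonly laid out; the function name `parse_answers` is hypothetical and is not part of this PR or of `tpch.rs`.

```rust
/// Split one pipe-delimited line into trimmed cells.
fn split_row(line: &str) -> Vec<String> {
    line.split('|').map(|cell| cell.trim().to_string()).collect()
}

/// Parse answer text into (header, rows).
/// Assumption: the first non-empty line is the header and every
/// following non-empty line is one result row.
fn parse_answers(text: &str) -> (Vec<String>, Vec<Vec<String>>) {
    let mut lines = text.lines().filter(|line| !line.trim().is_empty());
    let header = lines.next().map(split_row).unwrap_or_default();
    let rows = lines.map(split_row).collect();
    (header, rows)
}
```

Cells are kept as strings here so that the comparison strategy (exact or tolerant) can be decided separately.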

Are there any user-facing changes?

No

Dandandan · 2022-10-29T10:34:47Z

Looks great. I tried running the tests against SF=1. I also had SF=10 running, but that run fails with more errors (my feeling is that they are mostly rounding/overflow errors), and it is quite slow to read the files in CSV format.

This is what I get with TPCH_DATA=./tpch-dbgen/ cargo test --release -- --test-threads=1:

running 66 tests
test tests::q1 ... ok
test tests::q10 ... ok
test tests::q10_expected_plan ... ok
test tests::q11 ... FAILED
test tests::q11_expected_plan ... ok
test tests::q12 ... ok
test tests::q12_expected_plan ... ok
test tests::q13 ... ok
test tests::q13_expected_plan ... ok
test tests::q14 ... FAILED
test tests::q14_expected_plan ... ok
test tests::q15 ... ok
test tests::q15_expected_plan ... ok
test tests::q16 ... ok
test tests::q16_expected_plan ... ok
test tests::q17 ... FAILED
test tests::q17_expected_plan ... ignored
test tests::q18 ... ok
test tests::q18_expected_plan ... ok
test tests::q19 ... ok
test tests::q19_expected_plan ... ok
test tests::q1_expected_plan ... ok
test tests::q2 ... ok
test tests::q20 ... ok
test tests::q20_expected_plan ... ok
test tests::q21 ... ok
test tests::q21_expected_plan ... ok
test tests::q22 ... ok
test tests::q22_expected_plan ... ok
test tests::q2_expected_plan ... ok
test tests::q3 ... ok
test tests::q3_expected_plan ... ok
test tests::q4 ... ok
test tests::q4_expected_plan ... ok
test tests::q5 ... ok
test tests::q5_expected_plan ... ok
test tests::q6 ... FAILED
test tests::q6_expected_plan ... ok
test tests::q7 ... ok
test tests::q7_expected_plan ... ok
test tests::q8 ... ok
test tests::q8_expected_plan ... ok
test tests::q9 ... FAILED
test tests::q9_expected_plan ... ok
test tests::run_q1 ... ok
test tests::run_q10 ... ok
test tests::run_q11 ... ok
test tests::run_q12 ... ok
test tests::run_q13 ... ok
test tests::run_q14 ... ok
test tests::run_q15 ... ok
test tests::run_q16 ... ok
test tests::run_q17 ... ok
test tests::run_q18 ... ok
test tests::run_q19 ... ok
test tests::run_q2 ... ok
test tests::run_q20 ... ok
test tests::run_q21 ... ok
test tests::run_q22 ... ok
test tests::run_q3 ... ok
test tests::run_q4 ... ok
test tests::run_q5 ... ok
test tests::run_q6 ... ok
test tests::run_q7 ... ok
test tests::run_q8 ... ok
test tests::run_q9 ... ok

failures:

---- tests::q11 stdout ----
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 11, debug: false, iterations: 1, partitions: 2, batch_size: 8192, path: "./tpch-dbgen/", file_format: "tbl", mem_table: false, output_path: None, disable_statistics: false }
Query 11 iteration 0 took 824.6 ms and returned 29571 rows
Query 11 avg time: 824.63 ms
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `1048`,
 right: `29571`', benchmarks/src/bin/tpch.rs:1411:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

---- tests::q14 stdout ----
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 14, debug: false, iterations: 1, partitions: 2, batch_size: 8192, path: "./tpch-dbgen/", file_format: "tbl", mem_table: false, output_path: None, disable_statistics: false }
Query 14 iteration 0 took 2749.8 ms and returned 1 rows
Query 14 avg time: 2749.84 ms
thread 'main' panicked at 'assertion failed: schema_matches', benchmarks/src/bin/tpch.rs:1404:13

---- tests::q17 stdout ----
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 17, debug: false, iterations: 1, partitions: 2, batch_size: 8192, path: "./tpch-dbgen/", file_format: "tbl", mem_table: false, output_path: None, disable_statistics: false }
Query 17 iteration 0 took 9599.6 ms and returned 1 rows
Query 17 avg time: 9599.62 ms
thread 'main' panicked at 'assertion failed: schema_matches', benchmarks/src/bin/tpch.rs:1404:13

---- tests::q6 stdout ----
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 6, debug: false, iterations: 1, partitions: 2, batch_size: 8192, path: "./tpch-dbgen/", file_format: "tbl", mem_table: false, output_path: None, disable_statistics: false }
Query 6 iteration 0 took 2957.9 ms and returned 1 rows
Query 6 avg time: 2957.93 ms
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `["123141078.23"]`,
 right: `["75207429.08"]`', benchmarks/src/bin/tpch.rs:1415:17

---- tests::q9 stdout ----
Running benchmarks with the following options: DataFusionBenchmarkOpt { query: 9, debug: false, iterations: 1, partitions: 2, batch_size: 8192, path: "./tpch-dbgen/", file_format: "tbl", mem_table: false, output_path: None, disable_statistics: false }
Query 9 iteration 0 took 3962.8 ms and returned 175 rows
Query 9 avg time: 3962.77 ms
thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `["MOROCCO", "1997", "42698382.85"]`,
 right: `["MOROCCO", "1997", "42698382.86"]`', benchmarks/src/bin/tpch.rs:1415:17


failures:
    tests::q11
    tests::q14
    tests::q17
    tests::q6
    tests::q9

andygrove · 2022-10-29T12:13:52Z

Thanks for the review. This matches what I am seeing. My goal for today is to get these tests running in CI.

ursabot · 2022-10-29T12:21:50Z

Benchmark runs are scheduled for baseline = 3452345 and contender = 71f05a3. 71f05a3 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Generate expected benchmark query results

9022bab

Dandandan approved these changes Oct 29, 2022

andygrove merged commit 71f05a3 into apache:master Oct 29, 2022

andygrove deleted the generate-bench-answers branch October 29, 2022 12:14

jimexist pushed a commit to jimexist/arrow-datafusion that referenced this pull request Oct 31, 2022

Generate expected benchmark query results (apache#4010)

3f1e9ca

Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this pull request Nov 5, 2022

Generate expected benchmark query results (apache#4010)

c783b9e
